# MLflow Deployments for LLMs

>[The MLflow Deployments for LLMs](https://www.mlflow.org/docs/latest/llms/deployments/index.html) is a powerful tool designed to streamline the usage and management of various large
> language model (LLM) providers, such as OpenAI and Anthropic, within an organization. It offers a high-level interface
> that simplifies the interaction with these services by providing a unified endpoint to handle specific LLM related requests.

## Installation and Setup

Install `mlflow` with MLflow Deployments dependencies:

```sh
pip install 'mlflow[genai]'
```

Set the OpenAI API key as an environment variable:

```sh
export OPENAI_API_KEY=...
```

Create a configuration file:

```yaml
endpoints:
  - name: completions
    endpoint_type: llm/v1/completions
    model:
      provider: openai
      name: text-davinci-003
      config:
        openai_api_key: $OPENAI_API_KEY

  - name: embeddings
    endpoint_type: llm/v1/embeddings
    model:
      provider: openai
      name: text-embedding-ada-002
      config:
        openai_api_key: $OPENAI_API_KEY
```

Start the deployments server:

```sh
mlflow deployments start-server --config-path /path/to/config.yaml
```

## Example provided by `MLflow`

>The `mlflow.langchain` module provides an API for logging and loading `LangChain` models.
> This module exports multivariate LangChain models in the langchain flavor and univariate LangChain
> models in the pyfunc flavor.

See the [API documentation and examples](https://www.mlflow.org/docs/latest/python_api/mlflow.langchain) for more information.

## Completions Example

```python
import mlflow
from langchain.chains import LLMChain, PromptTemplate
from langchain.llms import Mlflow

llm = Mlflow(
    target_uri="http://127.0.0.1:5000",
    endpoint="completions",
)

llm_chain = LLMChain(
    llm=Mlflow,
    prompt=PromptTemplate(
        input_variables=["adjective"],
        template="Tell me a {adjective} joke",
    ),
)
result = llm_chain.run(adjective="funny")
print(result)

with mlflow.start_run():
    model_info = mlflow.langchain.log_model(chain, "model")

model = mlflow.pyfunc.load_model(model_info.model_uri)
print(model.predict([{"adjective": "funny"}]))
```

## Embeddings Example

```python
from langchain.embeddings import MlflowEmbeddings

embeddings = MlflowEmbeddings(
    target_uri="http://127.0.0.1:5000",
    endpoint="embeddings",
)

print(embeddings.embed_query("hello"))
print(embeddings.embed_documents(["hello"]))
```

## Chat Example

```python
from langchain.chat_models import ChatMlflow
from langchain.schema import HumanMessage, SystemMessage

chat = ChatMlflow(
    target_uri="http://127.0.0.1:5000",
    endpoint="chat",
)

messages = [
    SystemMessage(
        content="You are a helpful assistant that translates English to French."
    ),
    HumanMessage(
        content="Translate this sentence from English to French: I love programming."
    ),
]
print(chat(messages))
```
